Nicholas Ezra Pfaff

I am a second-year PhD student at the Massachusetts Institute of Technology, where I am advised by Prof. Russ Tedrake in the Robot Locomotion Group.
My research focuses on enabling realistic simulations at scale to generate training data for robotic manipulation foundation models. I enjoy blending model-based and learning-based approaches to tackle these challenges effectively.
Before joining MIT, I completed my MEng at Imperial College London and spent nine months at Ocado Technology, where I worked on parallel jaw pick-and-place systems at scale.
Outside of academia, I am passionate about rock climbing, backpacking, and specialty coffee.

Recent Events

  • September 2024
    I was a TA for MIT's Robotic Manipulation class
  • May 2024
    Joined Toyota Research Institute's Large Behavior Models team in Los Altos for an internship
  • September 2023
    Joined the Robot Locomotion Group for my Ph.D.
  • September 2022
    Received the 2022 Head of Department's prize from Imperial College London for being the top performing student in the third year of my course

Projects

An overview of some of the projects I worked on before my PhD. They are roughly ordered by relevance to robotic manipulation.

UR10e suction cup robot

One of Ocado's suction cup robots, which is similar to the parallel jaw version that I worked on.

Parallel Jaw System Robustness and Grasp Reasoning Improvements

Ocado Technology is developing an intelligent parallel jaw robotic system that can autonomously pick and pack grocery orders. The team's objective is to build a system that picks and packs every SKU (stock-keeping unit) that is pickable with a parallel jaw gripper while operating at human-like speed. I helped the team improve the system's robustness, leading to a significant decrease in both average cycle time and system downtime. This involved improving the robot controllers and implementing problem detection and recovery behaviours. I additionally focused on redesigning the grasp reasoning component to improve the pick success rate and speed. This consisted of both investigating and implementing new reasoning approaches.

Position modes.

The position modes for the finger-box contact pair. Within each position mode, there are six possible contact modes, which depend on the magnitude of the force between the bodies that make up the contact pair. We used GCS to optimally plan through these position and contact modes.

Automating Optimal Demonstrations: The Power of Graphs of Convex Sets

Behavior cloning for training visuomotor policies has become a popular framework in robotics. Yet, behavior cloning heavily depends on the quality of human demonstrations, which tend to be both suboptimal and time-consuming to collect. We proposed using Graphs of Convex Sets (GCS) to automatically create optimal demonstrations and using these demonstrations to train a Diffusion Policy. In particular, we used GCS to create plans through contact using full-state feedback in simulation. We then used the resulting trajectories to train a Diffusion Policy from only visual observations (a teacher-student setting). We showed that the planar pushing trajectories executed by the Diffusion Policy are close to the optimal ones that GCS would have produced for the same initial conditions, while not requiring any human demonstrations. In doing so, we revealed this novel paradigm's potential to overcome the many downsides of human-generated demonstrations. [GitHub] [video] [report]

Simple Grasp Reasoning Visualizer

A simple grasp reasoning visualizer from MIT's course on Robotic Manipulation which is shown as a replacement for the actual visualizer for confidentiality reasons.

Bin Picking Reasoning Visualizer

I developed a flexible 3D visualizer with a GUI for the bin picking reasoning component of Ocado Technology's parallel jaw robot system. The visualizer allowed the inspection of intermediate results and provided an interface for parameter tuning and geometric highlighting. Moreover, it was designed to be easily extendable to new kinds of data that might come with future algorithms. With the help of this visualizer, the parallel jaw team could accelerate the development and debugging of grasp generation algorithms.

Tenaci OpenMANIPULATOR-X robot making breakfast

Tenaci making breakfast and writing a good morning message.

Tenaci

I developed a robotic manipulation system based on the OpenMANIPULATOR-X as part of a course taught by Dr. Ad Spiers. Tenaci could rotate and stack cubes, draw sketches, and prepare a miniature breakfast. The project consisted of both modelling and programming the robot. All software apart from the Dynamixel servo library had to be written from scratch. I assigned the DH frames and derived the analytical inverse kinematics to model the 4-DOF (plus one-DOF gripper) robot. To complete the tasks, I designed a flexible motion planning system based on a LEGO concept: high-level tasks (e.g. stacking cubes) were constructed from sequences of simple actions. The waypoints produced by the motion planning algorithm were then converted into trajectories using cubic spline interpolation in either task or joint space. [GitHub] [video] [report]
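The waypoint-to-trajectory step can be sketched as follows. This is a minimal illustration using SciPy rather than the from-scratch Tenaci code, with hypothetical joint-space waypoints and a hypothetical 50 Hz command rate:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical joint-space waypoints (radians) for a 4-DOF arm:
# one row per waypoint, one column per joint.
times = np.array([0.0, 1.0, 2.0, 3.0])
waypoints = np.array([
    [0.0,  0.0, 0.0, 0.0],
    [0.3, -0.2, 0.5, 0.1],
    [0.6, -0.4, 0.2, 0.3],
    [0.9,  0.0, 0.0, 0.0],
])

# One cubic spline per joint; clamped boundary conditions give zero
# velocity at both ends for smooth starts and stops.
spline = CubicSpline(times, waypoints, bc_type="clamped")

# Sample the trajectory at 50 Hz for the servo command loop.
t = np.linspace(times[0], times[-1], int(50 * times[-1]) + 1)
positions = spline(t)      # (151, 4) joint positions
velocities = spline(t, 1)  # first derivative: joint velocities
```

The same interpolation applies in task space by treating end-effector coordinates, instead of joint angles, as the waypoint columns.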

Neural Field Decoder Architecture

Our neural field decoder architecture.

3D Neural Scene Representations for View-Invariant State Representation Learning

Performing tasks from visual input is a key goal in robotics. Such vision-based policies have improved significantly over the last few years. Yet, they tend to be sensitive to changes in camera pose during test time. We explored leveraging 3D neural scene representations to cultivate a camera-viewpoint-invariant state representation, enhancing resilience to camera pose variations. We underscored the efficacy of using these 3D scene representations for view-invariant state representation learning through comparisons with multiple baselines. Notably, integrating these 3D models with contrastive learning yielded especially promising results. [GitHub] [report]

image2inertia architecture diagram

Our model architecture. Blue boxes are systems, red ovals are predicted quantities, and green ovals are ground truth quantities.

image2inertia: Predicting Object Inertia from RGB Images

When humans interact with the world, they can predict an object's motion from vision alone. This intuition comes from years of prior experience manipulating objects and mapping the resulting observations back to the perceptual system. Rigid object motion in response to an applied force depends on the object's mass, its center of mass, and its moment of inertia. Previous works have focused on estimating an object's mass from images alone. We explored the more challenging problem of estimating an object's moment of inertia from a single image. In particular, we investigated whether we can use 3D priors to predict inertia more accurately. Our method introduced these 3D priors by utilizing a pre-trained object-centric neural radiance field model to construct a 3D representation that we then used to predict a physical mass density field. We then obtained the inertia by numerically integrating the predicted density field.
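The final step, turning a predicted density field into an inertia tensor, reduces to a discrete volume integral. A minimal sketch, assuming a voxelized density field on a regular grid (the grid resolution, cube test case, and function name are illustrative, not our actual pipeline):

```python
import numpy as np

def inertia_from_density(density, xs, ys, zs):
    """Integrate a voxelized density field (kg/m^3) into mass,
    center of mass, and the 3x3 inertia tensor about the CoM."""
    X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")
    dv = (xs[1] - xs[0]) * (ys[1] - ys[0]) * (zs[1] - zs[0])
    dm = density * dv  # mass per voxel
    mass = dm.sum()
    com = np.array([(X * dm).sum(), (Y * dm).sum(), (Z * dm).sum()]) / mass
    # Shift coordinates to the center of mass before taking second moments.
    x, y, z = X - com[0], Y - com[1], Z - com[2]
    Ixx = ((y**2 + z**2) * dm).sum()
    Iyy = ((x**2 + z**2) * dm).sum()
    Izz = ((x**2 + y**2) * dm).sum()
    Ixy = -(x * y * dm).sum()
    Ixz = -(x * z * dm).sum()
    Iyz = -(y * z * dm).sum()
    return mass, com, np.array([[Ixx, Ixy, Ixz],
                                [Ixy, Iyy, Iyz],
                                [Ixz, Iyz, Izz]])

# Sanity check against a uniform solid cube with side L and density rho:
# the analytic inertia about its center is (1/6) * m * L^2 on the diagonal.
n, L, rho = 64, 0.2, 1000.0
dx = L / n
grid = -L / 2 + dx * (np.arange(n) + 0.5)  # cell-centered samples
mass, com, I = inertia_from_density(rho * np.ones((n, n, n)), grid, grid, grid)
```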

Mars rover side view

A side view of the Mars rover.

Mars Rover

I led a team of six to create a fully autonomous mobile robot that could build a map of an unknown terrain while identifying coloured balls. My main contribution was designing and implementing a task-based FreeRTOS application on an ESP32 for connecting the independent modules (drive, energy, server, and vision). This involved evaluating and setting up different communication protocols (UART, SPI, and MQTTS), defining the data encodings, and mocking modules to enable independent testing. Furthermore, I implemented the robot automation algorithms (automatic next destination, shortest path, and path to drive instructions) and tested them using extensive unit tests. [GitHub] [video] [report]

Navigation challenge arena

The CoppeliaSim competition arena. The Monte Carlo localization particles are visualized around the robot.

Navigation Challenge

I scored second place in Prof. Andrew Davison's robotic navigation challenge, which concluded his course on mobile robotics. The challenge consisted of the robot being placed randomly on a known map with a noisy floor. Points were awarded for reaching all five goal locations to within 5 cm and for completing the course as quickly as possible. My implementation was based on Monte Carlo localization. The algorithm started with a large number of particles to achieve initial global localization. It then decreased the number of particles and sensor measurements per measurement update step to improve speed. [GitHub] [video]
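The many-particles-then-shrink idea can be sketched in one dimension. This toy version (hypothetical corridor map, noise parameters, and shrink threshold; the actual implementation worked on the 2D competition map) starts with a large uniform particle set and halves it once the estimate has converged:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1D corridor of length 10 m. The known "map" predicts the
# range sensor reading as the distance to the far wall: z = MAP_LENGTH - x.
MAP_LENGTH = 10.0
MOTION_NOISE, SENSOR_NOISE = 0.05, 0.1

def mcl_step(particles, move, measurement):
    # Motion update: shift every particle and add Gaussian noise.
    particles = particles + move + rng.normal(0, MOTION_NOISE, len(particles))
    particles = np.clip(particles, 0.0, MAP_LENGTH)
    # Measurement update: weight each particle by the likelihood
    # of the observed range given the particle's predicted range.
    expected = MAP_LENGTH - particles
    w = np.exp(-0.5 * ((measurement - expected) / SENSOR_NOISE) ** 2)
    w /= w.sum()
    # Resample; shrink the particle set once localization has converged.
    n_next = len(particles) if np.std(particles) > 0.5 else max(100, len(particles) // 2)
    return rng.choice(particles, size=n_next, p=w)

# Global localization: start with many particles spread over the map.
particles = rng.uniform(0.0, MAP_LENGTH, 5000)
true_x = 2.0
for _ in range(10):
    true_x += 0.3
    z = MAP_LENGTH - true_x + rng.normal(0, SENSOR_NOISE)
    particles = mcl_step(particles, 0.3, z)

estimate = particles.mean()
```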

Sumobot robot and controller

A top view of the robot and remote controller.

Sumobot

I recruited and led a group of four to place fourth overall in Imperial College London's inter-university Sumobot Competition. The competition consisted of rounds of remote-controlled robots fighting each other in a circular arena. I developed the robot's control and communication software. [GitHub] [video]

Online poker with FPGA controllers system diagram

The high-level system diagram, showing the functions of the individual components and the connections between them.

Online Poker Game with FPGAs as Controllers

I worked in a team of six to create an online poker game using FPGAs as controllers. Players could tilt their FPGA to view their cards and select the amount to bet (implemented using the onboard accelerometer). I was responsible for creating the system's server component. The server ran the game logic, kept track of the game and player states, and provided an HTTP API for client and webpage communication. The server also exposed a credential service for creating new users. I wrote extensive Bash testing scripts that simulated the FPGAs and enabled random and interactive game flow testing. [GitHub] [report]

LTS_V2 simulated circuit diagram

One of the circuit diagrams that we tested our simulator with.

Circuit Simulator

I worked in a team of three to develop a SPICE-style circuit simulator. We added support for most linear and non-linear circuit elements, such as current sources, ideal op-amps, capacitors, MOSFETs, and BJTs. We also implemented advanced algorithms such as dynamic time stepping and source stepping to improve the simulator's accuracy and convergence. Our simulator was highly modular, making it easy to add new components. [GitHub] [video] [report]
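Source stepping is worth a small illustration. Our simulator was written differently and handled full circuits, but the idea can be shown for a single resistor-diode branch: ramp the supply voltage up in small increments, warm-starting each Newton-Raphson solve from the previous solution so the exponential diode equation never blows up (component values and step count below are illustrative):

```python
import math

IS, VT = 1e-14, 0.025  # diode saturation current (A) and thermal voltage (V)
R, V_TARGET = 1e3, 5.0  # series resistance (ohm) and final supply voltage (V)

def residual(v, V):
    """KCL at the diode node: diode current minus resistor current."""
    return IS * (math.exp(v / VT) - 1.0) - (V - v) / R

def d_residual(v):
    return IS / VT * math.exp(v / VT) + 1.0 / R

def newton(v0, V, tol=1e-12, max_iter=100):
    v = v0
    for _ in range(max_iter):
        step = residual(v, V) / d_residual(v)
        v -= step
        if abs(step) < tol:
            return v
    raise RuntimeError("Newton did not converge")

# Source stepping: ramp the supply from 0 V to the target in small
# increments, warm-starting each solve from the previous operating point.
v = 0.0
steps = 50
for k in range(1, steps + 1):
    v = newton(v, V_TARGET * k / steps)
```

A cold Newton start at the full 5 V can diverge or overflow the exponential; the warm starts keep every iterate near the diode's forward voltage.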

Neo4j graph visualization

Visualization of a Neo4j graph database.

Graph Representation of the Internet

I created a dynamic graph representation of the Internet (BGP data, CAIDA data, etc.) to facilitate Netcraft's search for malicious autonomous systems (ASes) and enable their takedown. Netcraft was particularly interested in finding ASes that hide their connections to legitimate internet service providers (ISPs) behind multiple intermediaries, which is difficult to achieve with relational databases. At the end of my internship at Netcraft, the database contained around 500,000 nodes and two million edges.

V60 coffee brewing setup

A V60 coffee brewing setup.

Buna - Coffee Tracker

I independently developed a coffee tracker for specialty coffee enthusiasts. The tracker supported various types of filter coffee and hand-operated lever espresso machines. The project focused on creating a lightweight interface that allowed the user to rapidly retrieve and enter brewing and tasting data. I used the application multiple times a day for around half a year while learning how to prepare the perfect coffee. [GitHub]

Compiler operation diagram

A diagram showing the operations of a typical compiler, taken from Dr. John Wickerson's course on compilers.

C to MIPS Compiler

I worked in a team of two to build a compiler from C90 to MIPS assembly. We implemented most C90 language features, including global and local scopes, control flow, functions with a variable number of arguments, recursive function calls, n-dimensional arrays, enums, switch statements, typedef, sizeof, pointer arithmetic, structs, nested structs, and unnamed structs. I also developed a testing script for automatically assessing the compiler against numerous C90 test cases. [GitHub]

MIPS CPU diagram

A diagram showing our CPU's architecture.

MIPS CPU Testbench

I worked in a team of five to create a 32-bit Avalon bus interface compatible CPU based on a subset of the MIPS1 ISA. I was responsible for designing and implementing the CPU's extensive testing architecture. This involved writing Bash testing scripts, developing a CPU simulator for generating a reference output from MIPS assembly, and devising a comprehensive list of test assembly files focused on identifying corner cases. [GitHub]

CovidDB web application

A page of the CovidDB web application that was re-used from the tumour documentation system that I worked on.

Medical Documentation System

I worked with the team at ClarData to develop the TuDoc Breast web application, a tumour documentation system for hospitals. I started the project by creating the essential infrastructure and later focused on the functionality of specific pages, such as the one shown. My work was later re-used for other medical documentation systems such as the CovidDB.

Hobbies

I enjoy being active in my free time. At the moment, this includes bouldering multiple times a week, while in the past, it included Bachata dancing, competitive long-distance running, and competitive rowing. During holidays, I tend to trek, backpack, or cycle across the world.

Camping in Bulgaria
Bouldering indoors
Kilimanjaro Uhuru peak
Imperial College London Dance Showcase